On the Semiparametric Efficiency of the Scott-Wild Estimator under Choice-Based and Two-Phase Sampling

Author

  • Alan Lee
Abstract

Suppose that for each of a number of subjects, we measure a response y and a vector of covariates x, in order to estimate the parameters β of a regression model which describes the conditional distribution of y given x. If we have sampled directly from the conditional distribution, or even the joint distribution, we can estimate β without knowledge of the distribution of the covariates. In the case of a discrete response, which takes one of J values $y_1, \dots, y_J$, say, we often estimate β using a case-control sample, where we sample from the conditional distribution of x given $y = y_j$. This is particularly advantageous if some of the values $y_j$ occur with low probability. In case-control sampling, the likelihood involves the distribution of the covariates, which may be quite complex, and direct parametric modelling of this distribution may be too difficult. To get around this problem, the covariate distribution can be treated non-parametrically. In a series of papers (Scott and Wild 1986, 1997, 2001; Wild 1991), Scott and Wild developed an estimation technique which yields a semi-parametric estimate of β. They dealt with the unknown distribution of the covariates by profiling it out of the likelihood, and derived a set of estimating equations whose solution is the semi-parametric estimator of β. This technique also works well for more general sampling schemes, for example two-phase outcome-dependent stratified sampling. Here, the sample space is partitioned into S disjoint strata which are defined completely by the values of the response and possibly some of the covariates. In the first phase of sampling, a prospective sample of size N is taken from the joint distribution of x and y, but only the stratum each individual belongs to is observed. In the second phase, for s = 1, . . . , S, a sample of size $n_1^{(s)}$ is selected from the $n_0^{(s)}$ individuals in stratum s who were selected in the first phase, and the rest of the covariates are measured. Such a sampling scheme can reduce the cost of studies by confining the measurement of expensive variables to the ...
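
The two-phase scheme described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the author's procedure: it assumes a binary response (so S = 2 strata defined by y alone), a single standard-normal covariate, a logistic model with hypothetical parameters `beta0` and `beta1`, and simple random sampling without replacement in the second phase. The function name `two_phase_sample` and all argument names are illustrative.

```python
import math
import random
from collections import defaultdict


def two_phase_sample(N, n1, beta0=-3.0, beta1=1.0, seed=0):
    """Simulate two-phase outcome-dependent stratified sampling.

    Assumptions (not from the source): binary y, strata defined by y only,
    x ~ N(0, 1), logistic regression of y on x, and simple random sampling
    without replacement within each stratum in phase 2.

    n1 maps each stratum label s to the requested phase-2 sample size.
    Returns (n0, phase2): n0[s] is the phase-1 count in stratum s, and
    phase2[s] is the list of fully observed (x, y) pairs from stratum s.
    """
    rng = random.Random(seed)

    # Phase 1: draw N individuals from the joint distribution of (x, y).
    # In a real study only the stratum label would be recorded here.
    population = []
    strata = defaultdict(list)
    for i in range(N):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        y = 1 if rng.random() < p else 0
        population.append((x, y))
        strata[y].append(i)

    n0 = {s: len(members) for s, members in strata.items()}

    # Phase 2: within each stratum, subsample and "measure" the covariates.
    phase2 = {}
    for s, members in strata.items():
        size = min(n1[s], len(members))  # cannot exceed the phase-1 count
        chosen = rng.sample(members, size)
        phase2[s] = [population[i] for i in chosen]

    return n0, phase2


if __name__ == "__main__":
    n0, phase2 = two_phase_sample(N=2000, n1={0: 100, 1: 100})
    for s in sorted(n0):
        print(f"stratum {s}: phase-1 count {n0[s]}, phase-2 sample {len(phase2[s])}")
```

With a rare outcome (here controlled by the hypothetical `beta0`), the phase-1 counts are very unbalanced, while the phase-2 samples can be made roughly equal in size, which is where the cost saving mentioned in the abstract comes from.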

Similar resources

On the Breslow-Holubkov estimator.

Breslow and Holubkov (J Roy Stat Soc B 59:447-461 1997a) developed semiparametric maximum likelihood estimation for two-phase studies with a case-control first phase under a logistic regression model and noted that, apart from the overall intercept term, it was the same as the semiparametric estimator for two-phase studies with a prospective first phase developed in Scott and Wild (Biometrika 84...

Generalized Ridge Regression Estimator in Semiparametric Regression Models

In the context of ridge regression, the estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Much effort has gone into developing methods for computing shrinkage estimators for different fully parametric ridge regression approaches, using eigenvalues. However, the estimation of the shrinkage parameter has been neglected for semiparametric regression models. The m...

Kernel Ridge Estimator for the Partially Linear Model under Right-Censored Data

Objective: This paper aims to introduce a modified kernel-type ridge estimator for partially linear models under randomly right-censored data. Such models raise two main issues that need to be solved: multicollinearity and censoring. To address these issues, we improved the kernel estimator based on synthetic data transformation and kNN imputation techniques. The key idea of this paper is t...

Estimating Multiple Treatment Effects Using Two-phase Regression Estimators

We propose a semiparametric two-phase regression estimator with a semiparametric generalized propensity score estimator for estimating average treatment effects in the presence of informative first-phase sampling. The proposed estimator can be easily extended to any number of treatments and does not rely on a prespecified form of the response or outcome functions. The proposed estimator is show...

Journal:
  • JAMDS

Volume 2007, issue not listed

Pages: not listed

Publication date: 2007